Green Transit Analysis: The Quest for a Cleaner Commute
Introduction
In an era where climate change is the villain and carbon footprints are the antagonist, public transit emerges as the unsung hero of sustainability. But just how green is your local transit agency? Welcome to our deep dive into transit emissions, where we crunch numbers, sip coffee, and decide which agencies deserve a gold star and which deserve a strongly worded letter.
Why This Matters?
Public Transit vs. Cars: Does taking the bus really save the planet?
State-Level CO₂ Impact: Which states are leading the charge, and which are... not?
Most Efficient Agencies: Who deserves a Green Medal, and who needs to rethink their fuel strategy?
Data Loading
Before we scrape, let's ensure we have the right R packages installed. But shh! We'll keep it behind the scenes.
GTA IV theme
Most of the visualizations and tables share a single theme built on GTA IV-style colors.
Building EIA State Profile Table
Power Play: Uncovering the State-Level Electricity Story
Welcome to the electric showdown, where we expose which U.S. states are burning cash or burning carbon in the name of power!
We'll tackle five burning questions:
1. Which state is paying the most for electricity? (Cha-ching!)
2. Which state is emitting the most CO₂ per MWh? (Cough cough...)
3. What's the national weighted average CO₂ emission per MWh?
4. What's the rarest primary energy source, and where is it used?
5. Is New York really cleaner than Texas, or is it all just subway PR?
Let's find out!
Q1: Which state charges the most for electricity?
Electricity isn't cheap, but some states are definitely charging a shocking amount per megawatt-hour. Let's find out who tops the list:
```r
# Get top state
most_expensive_state <- EIA_SEP_REPORT %>%
  arrange(desc(electricity_price_MWh)) %>%
  slice_head(n = 1) %>%
  select(state, electricity_price_MWh)

# Table
gta_kable_style(most_expensive_state, caption = "The Most Expensive State for Electricity")
```

The Most Expensive State for Electricity

| state | electricity_price_MWh |
|---|---|
| Hawaii | 386 |

```r
# Top 5 plot data
most_expensive_state_plot <- EIA_SEP_REPORT %>%
  arrange(desc(electricity_price_MWh)) %>%
  slice_head(n = 5)

# Plot
ggplot(most_expensive_state_plot,
       aes(x = reorder(state, electricity_price_MWh), y = electricity_price_MWh)) +
  geom_col(fill = highlight_color, color = accent_color) +
  coord_flip() +
  labs(
    title = "Top 5 States by Electricity Price",
    x = "State",
    y = "Price ($/MWh)",
    caption = "Source: EIA State Profiles"
  ) +
  theme_gta()
```

Fun fact: If you think your energy bill is bad, just wait until you see which state is breaking the bank.
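The price column is reported in dollars per megawatt-hour, while the scraping code in the source reads an "average retail price (cents/kWh)" item and multiplies it by 10. A minimal sketch of that unit conversion follows; the helper name `cents_kwh_to_dollars_mwh` is hypothetical, and the 38.6 cents/kWh input is simply back-computed from Hawaii's 386 $/MWh.

```r
# Convert a retail price from cents/kWh to $/MWh.
# 1 MWh = 1000 kWh and 1 dollar = 100 cents, so the factor is 1000 / 100 = 10.
# The function name is illustrative, not part of the original analysis.
cents_kwh_to_dollars_mwh <- function(cents_per_kwh) {
  cents_per_kwh * 1000 / 100
}

# 38.6 cents/kWh is back-computed from the 386 $/MWh reported for Hawaii.
hawaii_price <- cents_kwh_to_dollars_mwh(38.6)
print(hawaii_price)
```

Keeping the conversion in one named helper makes the unit change visible rather than leaving a bare `* 10` in the pipeline.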
Q2: Who is the dirtiest of them all?
Which state is the biggest polluter when it comes to electricity generation? Spoiler: It's not where you'd expect.

```r
dirtiest_state <- EIA_SEP_REPORT %>%
  arrange(desc(CO2_MWh)) %>%
  slice_head(n = 1) %>%
  select(state, CO2_MWh, primary_source)

# Table
gta_kable_style(dirtiest_state, caption = "The Dirtiest State for Electricity", col2 = 3)
```

The Dirtiest State for Electricity

| state | CO2_MWh | primary_source |
|---|---|---|
| West Virginia | 1925 | Coal |

```r
# Top 5 dirtiest
top_5_dirty <- EIA_SEP_REPORT %>%
  arrange(desc(CO2_MWh)) %>%
  slice_head(n = 5)

ggplot(top_5_dirty, aes(x = reorder(state, CO2_MWh), y = CO2_MWh)) +
  geom_col(fill = highlight_color, color = accent_color) +
  coord_flip() +
  labs(
    title = "Top 5 Dirtiest States by CO₂ Emissions",
    x = "State",
    y = "CO₂ Emissions (lbs/MWh)",
    caption = "Source: EIA State Profiles"
  ) +
  theme_gta()
```

Shocking stat: This state produces more pounds of CO₂ per megawatt-hour than anywhere else!
Q3: What's the weighted average CO₂ per MWh?
Let's compute the weighted average carbon emissions across all states.

```r
# Calculate weighted average
weighted_avg_CO2 <- weighted.mean(EIA_SEP_REPORT$CO2_MWh, EIA_SEP_REPORT$generation_MWh, na.rm = TRUE)

weighted_avg_df <- data.frame(
  Metric = "Weighted Avg CO₂ (lbs/MWh)",
  Value = round(weighted_avg_CO2, 2)
)

gta_kable_style(weighted_avg_df, caption = "National Weighted Average CO₂ per MWh")
```

National Weighted Average CO₂ per MWh

| Metric | Value |
|---|---|
| Weighted Avg CO₂ (lbs/MWh) | 805.47 |

Did you know? The lower this number, the greener the electricity grid!
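To see why generation is used as the weight, here is a small sketch with three made-up states (all numbers are hypothetical and chosen only to make the arithmetic easy). It compares the hand-computed weighted mean with base R's `weighted.mean` and with the plain mean.

```r
# Toy data: three hypothetical states (values are made up for illustration).
co2_lbs_per_mwh <- c(200, 800, 1900)    # emissions intensity in lbs/MWh
generation_mwh  <- c(50e6, 30e6, 20e6)  # net generation, used as weights

# Weighted mean by hand: sum(w * x) / sum(w)
manual <- sum(co2_lbs_per_mwh * generation_mwh) / sum(generation_mwh)

# Same calculation with the base R helper used in the analysis
builtin <- weighted.mean(co2_lbs_per_mwh, generation_mwh)

# The unweighted mean treats every state equally, regardless of output
simple <- mean(co2_lbs_per_mwh)

print(c(manual = manual, builtin = builtin, simple = simple))
```

In this toy case the cleanest state generates the most electricity, so the weighted figure comes out lower than the simple average; a state-count average would overstate how dirty the typical megawatt-hour is.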
Q4: What's the rarest primary energy source?
Some states use unique energy sources. Let's see which is the rarest!
Q4b: Which states use this rare energy source?

```r
states_using_rare <- EIA_SEP_REPORT %>%
  filter(primary_source == rare_energy$primary_source) %>%
  select(state, electricity_price_MWh)

gta_kable_style(states_using_rare, caption = "States Using the Rarest Energy Source")
```

States Using the Rarest Energy Source

| state | electricity_price_MWh |
|---|---|
| District of Columbia | 130 |

Fun fact: Sometimes the rarest energy sources are also the most expensive!
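Counting states per primary source is sensitive to how the labels are spelled. The only match above is the District of Columbia, a row the source adds by hand with `primary_source = "Natural Gas"`; if the scraped rows capitalize that label differently (an assumption worth checking, not something the output confirms), the hand-entered value would be counted as its own one-member category. A minimal base R sketch of the effect, using made-up labels:

```r
# Hypothetical labels: six scraped rows plus one hand-entered row whose
# capitalization differs ("Natural Gas" vs "Natural gas").
primary_source <- c("Natural gas", "Coal", "Natural gas", "Nuclear",
                    "Coal", "Nuclear", "Natural Gas")

# Raw counts treat the two spellings as different sources.
raw_counts <- sort(table(primary_source))
rarest_raw <- names(raw_counts)[1]

# Normalizing case and whitespace first merges them.
clean_counts <- sort(table(tolower(trimws(primary_source))))

print(raw_counts)
print(clean_counts)
```

After normalization no source in this toy data has a count of one, so the "rarest" answer changes. When several sources tie for the minimum, inspect the full table rather than taking the first row.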
Q5: How much cleaner is New York compared to Texas?
New York and Texas have wildly different energy landscapes. Let's compare their emissions per megawatt-hour:

```r
# Bar chart: NY vs TX only
ny_tx_df <- comparison_table[1:2, ]
ny_tx_df$State <- factor(ny_tx_df$State, levels = c("New York", "Texas"))

ggplot(ny_tx_df, aes(x = State, y = CO2.per.MWh, fill = State)) +
  geom_col(show.legend = FALSE, color = accent_color) +
  scale_fill_manual(values = c("New York" = highlight_color, "Texas" = highlight_color)) +
  labs(
    title = "CO₂ Emissions: New York vs Texas",
    x = "State",
    y = "CO₂ per MWh",
    caption = "Source: EIA State Profiles"
  ) +
  theme_gta()
```

Reality check: Texas emits `r round(clean_factor, 2)` times as much CO₂ per MWh as New York. Everything is bigger in Texas, including the carbon footprint!
Conclusion
Electricity is not created equal across the U.S. Some states are climate champions, while others... well, they need a little work. But the good news? Change is happening! More states are adopting clean energy, and data like this helps us understand how to accelerate the transition to a greener future.
Fueling Up for Transit Analysis!
1. The NTD Energy Data
2. Decoding Transit Modes
Understanding transit modes is crucial! Let's transform those cryptic codes into human-friendly labels.
Sample of NTD Energy (Long Format)

| NTD ID | Mode | Agency Name | Fuel | Energy_Consumed |
|---|---|---|---|---|
| | | Los Angeles County Metropolitan Transportation Authority | Gasoline | 240936 |
| 90078 | Motor Bus | Central Contra Costa Transit Authority | Gasoline | 7085 |
| 40100 | Motor Bus | Santee Wateree Regional Transportation Authority | Diesel Fuel | 583 |
| 90030 | Motor Bus | North County Transit District | Electric Battery | 65040 |
| 90012 | Motor Bus | San Joaquin Regional Transit District | Electric Battery | 864632 |
| 50026 | Motor Bus | City of Moorhead | Diesel Fuel | 92425 |
| 90026 | Demand Response | San Diego Metropolitan Transit System | Gasoline | 6089 |
| 30071 | Demand Response | City of Alexandria | Gasoline | 32054 |
| 50145 | Demand Response | City of Kokomo | Gasoline | 57996 |
| 60017 | Streetcar | Central Oklahoma Transportation and Parking Authority | Electric Propulsion | 1572579 |
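The code-to-label step described above can also be written as a named-vector lookup instead of a long `case_when`. The sketch below is an alternative illustration, not the method used in the source; it covers only four of the mode codes, and `ZZ` is a made-up code included to show the fallback.

```r
# Partial lookup table: NTD mode code -> readable label.
# Only a subset of codes is listed here for brevity.
mode_labels <- c(
  DR = "Demand Response",
  MB = "Motor Bus",
  HR = "Heavy Rail",
  SR = "Streetcar"
)

# "ZZ" is a hypothetical unknown code used to exercise the fallback.
codes <- c("MB", "DR", "SR", "ZZ")

# Index the named vector by code; unmatched codes come back as NA.
labels <- unname(mode_labels[codes])
labels[is.na(labels)] <- "Unknown"

print(labels)
```

A lookup vector keeps the mapping in one data structure, which makes it easy to review against the NTD code list or to load from a file.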
Conclusion: Data Ready for Analysis!
We have successfully loaded, cleaned, and processed the NTD Energy dataset!
Now it's primed and ready for deeper analysis. Stay tuned for insights on emissions, efficiency, and green transit leaders!
NTD Service Data

```r
# Final clean version
NTD_SERVICE <- NTD_SERVICE_CLEAN %>%
  select(`NTD ID`, Agency, City, State, UPT, MILES) %>%
  filter(!is.na(UPT), !is.na(MILES), UPT > 0, MILES > 0)

# GTA-styled table output
sample_service_table <- head(NTD_SERVICE, 5)
gta_kable_style(sample_service_table, caption = "Sample of Cleaned NTD Service Data", col2 = 2)
```

Sample of Cleaned NTD Service Data

| NTD ID | Agency | City | State | UPT | MILES |
|---|---|---|---|---|---|
| 1 | King County, dba: King County Metro | Seattle | WA | 78886848 | 301530502 |
| 2 | Spokane Transit Authority | Spokane | WA | 9403739 | 46318134 |
| 3 | Pierce County Transportation Benefit Area Authority, dba: Pierce Transit | Lakewood | WA | 6792245 | 40362320 |
| 5 | City of Everett, dba: Everett Transit | Everett | WA | 1404970 | 5193721 |
| 6 | City of Yakima, dba: Yakima Transit | Yakima | WA | 646711 | 3435365 |
Unveiling the Champions of Public Transit!
Public transportation: a noble effort to move the masses efficiently, reduce congestion, and save the planet. But how do different transit agencies measure up? Let's crunch the numbers and find out who's leading the charge!
The Most Popular Transit Service (Q1)
Which agency moves the most people? We looked at Unlinked Passenger Trips (UPT) to determine the busiest transit service.

```r
most_upt_service <- NTD_SERVICE %>%
  arrange(desc(UPT)) %>%
  select(Agency, State, UPT) %>%
  head(1)

gta_kable_style(most_upt_service, caption = "Transit Agency with the Most Riders", col2 = 2)
```

Transit Agency with the Most Riders

| Agency | State | UPT |
|---|---|---|
| MTA New York City Transit | NY | 2632003044 |
NYC Subway: The Land of Long Rides (Q2)
Let's calculate the average trip length for MTA New York City Transit (spoiler: it's longer than your last relationship).

```r
mta_nyc_trip_length <- NTD_SERVICE %>%
  filter(Agency == "MTA New York City Transit") %>%
  summarise(`Avg Trip Length (Miles)` = mean(MILES / UPT, na.rm = TRUE))

gta_kable_style(mta_nyc_trip_length, caption = "Average Trip Length for MTA NYC Transit")
```

Average Trip Length for MTA NYC Transit

| Avg Trip Length (Miles) |
|---|
| 3.644089 |
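One detail to watch in the calculation above: `mean(MILES / UPT)` averages per-row ratios, which equals total miles divided by total trips only when the agency has a single row. If an agency appears on several rows (for example one per mode, which is an assumption about the data layout), the two numbers differ. A small sketch with made-up figures:

```r
# Two hypothetical service rows for one agency (illustrative numbers only).
service <- data.frame(
  mode  = c("Heavy Rail", "Motor Bus"),
  UPT   = c(1000, 100),
  MILES = c(5000, 200)
)

# Mean of per-row ratios: each row counts equally.
mean_of_ratios <- mean(service$MILES / service$UPT)

# Ratio of sums: each trip counts equally.
ratio_of_sums <- sum(service$MILES) / sum(service$UPT)

print(c(mean_of_ratios = mean_of_ratios, ratio_of_sums = ratio_of_sums))
```

The ratio of sums is usually the figure people mean by "average trip length", since it weights every passenger trip the same.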
Where's the Longest Ride in New York? (Q3)
Not all New York transit rides are equal! Which agency offers the longest average trip? Note that the filter covers every agency in New York State, so the winner need not operate inside New York City.

```r
nyc_longest_trip <- NTD_SERVICE %>%
  filter(State == "NY") %>%
  mutate(avg_trip_length = MILES / UPT) %>%
  arrange(desc(avg_trip_length)) %>%
  select(Agency, City, avg_trip_length) %>%
  head(1)

gta_kable_style(nyc_longest_trip, caption = "New York Agency with Longest Avg Trip", col2 = 3)
```

New York Agency with Longest Avg Trip

| Agency | City | avg_trip_length |
|---|---|---|
| Hampton Jitney, Inc. | Calverton | 92.4465 |
Who's Driving the Least? (Q4)
We also looked at the state with the fewest total miles traveled on public transit. (Because not everyone has places to be.)

```r
fewest_miles_state <- NTD_SERVICE %>%
  group_by(State) %>%
  summarise(`Total Transit Miles` = sum(MILES, na.rm = TRUE)) %>%
  arrange(`Total Transit Miles`) %>%
  head(1)

gta_kable_style(fewest_miles_state, caption = "State with the Fewest Transit Miles", col2 = 2)
```

State with the Fewest Transit Miles

| State | Total Transit Miles |
|---|---|
| NH | 3749892 |
Missing States Alert! (Q5)
Are there states missing from the National Transit Database (NTD)? Let's find out!

```r
all_states <- data.frame(State = state.abb, Full_State_Name = state.name)

missing_states <- all_states %>%
  anti_join(NTD_SERVICE, by = "State")

gta_kable_style(missing_states, caption = "States Missing from NTD Service Data", col2 = 2)
```

States Missing from NTD Service Data

| State | Full_State_Name |
|---|---|
| AZ | Arizona |
| AR | Arkansas |
| CA | California |
| CO | Colorado |
| HI | Hawaii |
| IA | Iowa |
| KS | Kansas |
| LA | Louisiana |
| MO | Missouri |
| MT | Montana |
| NE | Nebraska |
| NV | Nevada |
| NM | New Mexico |
| ND | North Dakota |
| OK | Oklahoma |
| SD | South Dakota |
| TX | Texas |
| UT | Utah |
| WY | Wyoming |
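The same check can be done with base R's `setdiff`, shown here on a made-up set of state codes. One caution when reading the table above: the 19 missing states are exactly the states in FTA regions 6 through 9, and NTD IDs start with the region digit, so the gap looks like a download that stopped partway through rather than states with no transit. Endpoints of the `resource/<id>.csv` form commonly cap the number of rows returned unless a `$limit` query parameter is supplied; treat that explanation as an assumption to verify against the raw file.

```r
# Hypothetical set of state codes present in a downloaded service table.
present_states <- c("WA", "NY", "NH", "DC", "PR")

# state.abb ships with base R and holds the 50 state abbreviations
# (it does not include DC or PR).
missing_states <- setdiff(state.abb, present_states)

# A quick sanity check: a large number of missing states is a signal to
# inspect the raw download (for example its row count) before interpreting it.
n_missing <- length(missing_states)
print(n_missing)
```

Comparing the row count of the downloaded file with the count reported by the data portal is a cheap way to confirm whether the table is complete.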
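Key Takeaways
Most riders: MTA New York City Transit leads with about 2.6 billion unlinked passenger trips.
Trip length: MTA New York City Transit trips average about 3.6 miles, longer than your favorite TV show's hiatus.
Smallest transit footprint: New Hampshire reports the fewest total transit miles.
Missing states: 19 states, California and Texas among them, have no rows in the service data. A gap that large more likely points to an incomplete download than to states without transit, so it deserves a check before drawing conclusions.
By automating the data collection, cleaning, and analysis, we enable cities and policymakers to make informed, data-driven decisions towards a greener future!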
Task 6: Normalizing Emissions, the Great Equalizer
Welcome back to Green Transit Awards™, where transit agencies battle it out for climate glory. Now that we've calculated total emissions like responsible climate nerds, it's time to normalize that data and level the playing field. Because let's be honest:
"Saying a giant city emits more CO₂ than a town with three buses is like saying King Kong eats more bananas than a hamster."
Objective
We're diving deep into emissions per rider (UPT) and emissions per passenger mile to uncover who's doing the most with the least carbon. It's not about how big you are; it's how efficiently you roll.
How We Did It: Normalization Explained
Using our previously calculated final_emissions_table, we grouped the data by Agency + State and summed the following:
Total_Emissions_kg: Total kilograms of CO₂ emitted
Total_UPT: Unlinked Passenger Trips
Total_MILES: Total Passenger Miles
We then calculated two key metrics:
kg_per_UPT = Emissions per rider (carbon cost of a ride)
kg_per_Mile = Emissions per mile (carbon cost of distance)
These are our battle stats: the CO₂ K/D ratio of transit.
Agency Size Categories
Because it's not fair to compare the MTA to a trolley in a beach town, we grouped agencies by ridership size:
Small: < 1 million UPT/year
Medium: 1–10 million UPT
Large: 10+ million UPT
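The normalization and size bucketing described above can be sketched in a few lines of base R. The three agencies and their totals are invented for illustration; only the column names and the size thresholds come from the text.

```r
# Three hypothetical agencies with already-summed totals (made-up values).
agencies <- data.frame(
  Agency             = c("Agency A", "Agency B", "Agency C"),
  Total_Emissions_kg = c(5e5, 4e6, 9e7),
  Total_UPT          = c(4e5, 5e6, 6e7),
  Total_MILES        = c(2e6, 2e7, 3e8)
)

# Emissions per rider and per passenger mile.
agencies$kg_per_UPT  <- agencies$Total_Emissions_kg / agencies$Total_UPT
agencies$kg_per_Mile <- agencies$Total_Emissions_kg / agencies$Total_MILES

# Size buckets: Small < 1 million, Medium 1 to 10 million, Large 10+ million.
# right = FALSE makes each interval closed on the left, so exactly
# 10 million UPT falls into "Large".
agencies$size <- cut(
  agencies$Total_UPT,
  breaks = c(0, 1e6, 1e7, Inf),
  labels = c("Small", "Medium", "Large"),
  right  = FALSE
)

print(agencies)
```

Using `cut` with explicit breaks keeps the boundary rule in one place, which matters for agencies sitting right at 1 or 10 million trips.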
Top 10 Most Efficient Agencies (Per Rider)
These agencies produce the lowest emissions per person. They move you cleanly, like a ninja on a carbon diet.
Most Efficient Agencies (Per Passenger Mile)

| Agency | State | Total_Emissions_kg | Total_MILES | kg_per_Mile | size |
|---|---|---|---|---|---|
| Ms Coast Transportation Authority | MS | 5269865 | 55688752 | 0.0946307 | Medium |
| Snohomish County Public Transportation Benefit Area Corporation | WA | 56003868 | 471189320 | 0.1188564 | Large |
| Intercity Transit | WA | 18992023 | 147168660 | 0.1290494 | Large |
| The Tri-County Council for the Lower Eastern Shore of Maryland | MD | 4484579 | 30017210 | 0.1494003 | Medium |
| City of Fayetteville | NC | 6923514 | 45495870 | 0.1521789 | Large |
| Ann Arbor Area Transportation Authority | MI | 22334550 | 141721440 | 0.1575947 | Large |
| Potomac and Rappahannock Transportation Commission | VA | 34575352 | 219347540 | 0.1576282 | Medium |
| Adirondack Transit Lines, Inc. | NY | 5340182 | 31065245 | 0.1719021 | Small |
| Central Oregon Intergovernmental Council | OR | 3151714 | 17603085 | 0.1790433 | Medium |
| Central Midlands Regional Transportation Authority | SC | 14269704 | 74085468 | 0.1926114 | Large |
GTA IV Green Transit Awards: The Ceremony
Welcome to Liberty City's version of the Oscars, but for public transit.
Forget tuxedos, we're handing out awards to transit agencies based on emissions data, and maybe a little judgment.
We've split the awards into four hard-hitting GTA-style categories:
Greenest Agency (lowest CO₂ per mile)
Most Emissions Avoided (vs your cousin's gas guzzler)
Electrification Excellence (largest electric share of emissions)
The "Yikes" Award (highest CO₂ per mile; yeah, we're looking at you)
Let's break it down.
Greenest Transit Agencies by Size
These agencies didn't just go green; they went full Claude Speed on carbon. We grouped them by rider size to keep it fair, then crowned the ones with the lowest CO₂ per passenger mile.
If your agency saves more emissions than a weekend traffic jam in Algonquin, you get on this list. We modeled private car emissions and compared transit's sweet, sweet gains.
Most Emissions Avoided by Transit Agencies (By Size)

| Agency_Size | Agency | State | Emissions_Avoided |
|---|---|---|---|
| Large | MTA New York City Transit | NY | 7519101389 |
| Medium | MTA Long Island Rail Road | NY | 1435350705 |
| Small | Hampton Jitney, Inc. | NY | 28931084 |

```r
ggplot(emissions_avoided_by_size,
       aes(x = Agency, y = 1, size = Emissions_Avoided, fill = Agency_Size)) +
  geom_point(shape = 21, color = "white", stroke = 1.5) +
  scale_size(range = c(15, 50), name = "Emissions Avoided (kg)") +
  scale_fill_manual(values = c("Large" = highlight_color, "Medium" = accent_color, "Small" = "#00FF95")) +
  labs(
    title = "Emissions Avoided by Transit Agencies",
    subtitle = "Each bubble scaled by kg of CO₂ avoided",
    x = NULL, y = NULL
  ) +
  theme_gta() +
  geom_text(aes(label = paste0(round(Emissions_Avoided / 1e6, 1), "M kg")),
            vjust = -4, size = 4, color = "white")
```
Electrification Excellence (By Size)
Some agencies plugged in and never looked back. We honored those who rely most on electric power for CO₂ savings. Liberty City salutes your socket game.
These agencies showed us who's really pulling their weight, and who's puffing more smoke than a busted Sabre GT.
From clean miles to electric rides, we've scraped, cleaned, calculated, and visualized the wild world of U.S. transit emissions.
If you're not green, you're just another red dot on the radar. Stay clean, Liberty City.
Green Transit Awards: Liberty City Press Release
"If you can dodge congestion, you can dodge carbon."
Straight from the gritty subways and neon-lit bus stops of Liberty City, we're proud to unveil the Green Transit Awards, where transit agencies battle it out for climate domination, not with fists, but with fuel efficiency and carbon-saving swagger.
Clean Ride Royalty: The Greenest Transit Agencies by Size
Forget horsepower; this is about carbon-footprint finesse. These agencies prove you don't need to burn rubber to move people. We crunched the emissions data, normalized it to CO₂ per passenger mile, and crowned the cleanest of the clean:

| Size | Agency | State | CO₂ per Mile (kg) |
|---|---|---|---|
| Large | MTA New York City Transit | NY | 0.000046 |
| Medium | Stark Area Regional Transit Authority | OH | 0.000000 |
| Small | City of Appleton, dba: Valley Transit | WI | 0.000438 |

Stark Area Regional Transit Authority is so clean, we double-checked if they were teleporting people.
NYC's MTA proves that even in a sprawling mega-metropolis, you can still keep it green.
Wisconsin's Valley Transit? More eco than a farmers' market on a fixie.
The Carbon Capos: Most Emissions Avoided by Transit Agencies
Step aside, Teslas. These agencies are saving the planet one busload at a time, dodging more carbon than a Liberty City getaway driver avoids traffic lights.
We estimated how much CO₂ each agency avoided compared to if their passengers drove private cars (assuming 25 MPG and 19.6 lbs CO₂ per gallon). Here are your MVPs, Most Valuable Polluters... Avoided:

| Size | Agency | State | CO₂ Avoided (kg) |
|---|---|---|---|
| Large | MTA New York City Transit | NY | 7,519,101,389 |
| Medium | MTA Long Island Rail Road | NY | 1,435,350,705 |
| Small | Hampton Jitney, Inc. | NY | 28,931,084 |

New York sweep! The Empire State is practically smudging carbon off the map.
MTA NYC singlehandedly avoided more emissions than some countries emit.
Hampton Jitney said "luxury bus" and luxury planet.
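The car-equivalent calculation behind these numbers can be sketched from the stated assumptions (25 MPG, 19.6 lbs CO₂ per gallon). The passenger-mile input below is reconstructed from figures reported earlier (2,632,003,044 trips times an average trip length of 3.644089 miles); the helper name and the reconstruction are illustrative, not the author's actual code. The pounds result lands very close to the 7,519,101,389 shown for MTA New York City Transit under a kg label, so it is worth checking whether a pound-to-kilogram conversion was applied before the transit emissions were subtracted.

```r
# Assumptions stated in the text: 25 MPG and 19.6 lbs CO2 per gallon.
mpg <- 25
lbs_co2_per_gallon <- 19.6
kg_per_lb <- 0.45359237  # definition of the avoirdupois pound in kilograms

# CO2 (in pounds) emitted if the same passenger miles were driven by car.
# Function name is hypothetical.
car_emissions_lbs <- function(passenger_miles) {
  passenger_miles / mpg * lbs_co2_per_gallon
}

# Passenger miles reconstructed from the reported trips and trip length.
nyct_miles   <- 2632003044 * 3.644089
nyct_car_lbs <- car_emissions_lbs(nyct_miles)
nyct_car_kg  <- nyct_car_lbs * kg_per_lb

print(c(lbs = nyct_car_lbs, kg = nyct_car_kg))
```

Emissions avoided would then be the car-equivalent figure minus the agency's own emissions, with both terms expressed in the same unit.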
Electrification Excellence: The Battery Bosses
While some agencies are still guzzling gas like it's 1999, these transit legends have gone full electric, zapping emissions with the finesse of a Liberty City hacker on a subway heist.
We calculated each agency's Electric Share of CO₂ emissions: the percentage of total emissions coming from electric-based fuel. And these winners? 100% electric. That's right, not a single puff of smoke.

| Size | Agency | State | Electric Share |
|---|---|---|---|
| Large | Massachusetts Bay Transportation Authority | MA | 100% |
| Medium | King County, dba: King County Metro | WA | 100% |
| Small | City of Wilsonville, dba: South Metro Area Regional Transit | OR | 100% |

They didn't just ride the wave; they charged it.
Not 99%. Not "we're working on it." Straight-up 100% electric, baby.
While others are debating fuel blends, these agencies said "outlet or bust."
Metric calculated as:
Electric Share = CO₂ emissions from electric modes ÷ Total CO₂ emissions
Reference point: The median agency's electric share? ~17%.
These awardees are basically driving a Tesla bus in the Matrix.
Data sources: FTA NTD Energy Data (2023), EIA Fuel Emission Factors
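The electric share formula above can be sketched on a tiny made-up emissions table. The fuel labels `Electric Propulsion` and `Electric Battery` appear in the energy sample earlier; treating exactly those two as "electric" is an assumption of this sketch, as are the agencies and quantities.

```r
# Hypothetical per-fuel emissions for two agencies (made-up values).
emissions <- data.frame(
  Agency = c("A", "A", "B", "B"),
  Fuel   = c("Electric Propulsion", "Diesel Fuel",
             "Electric Battery", "Electric Propulsion"),
  CO2_kg = c(300, 700, 50, 150)
)

# Fuels counted as electric in this sketch (an assumption).
electric_fuels <- c("Electric Propulsion", "Electric Battery")
is_electric <- emissions$Fuel %in% electric_fuels

# Sum electric and total emissions per agency, then take the ratio.
electric_kg    <- tapply(emissions$CO2_kg * is_electric, emissions$Agency, sum)
total_kg       <- tapply(emissions$CO2_kg, emissions$Agency, sum)
electric_share <- electric_kg / total_kg

print(electric_share)
```

Note that a share of 100% means all of an agency's counted emissions come from electricity, which is not the same as zero emissions; the grid mix still determines how much CO₂ each megawatt-hour carries.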
The "Yikes" Award: Most CO₂ per Mile (By Size)
Some agencies shine like neon on a Liberty City taxi. Others... well... belch more CO₂ than a broken-down Blista Compact doing donuts in Broker. These transit operations didn't just miss the green bus; they set it on fire on the way out.
We calculated each agency's CO₂ per mile to see who's earning their carbon karma the hard way.

| Size | Agency | State | CO₂ per Mile (kg) |
|---|---|---|---|
| Large | Washington Metropolitan Area Transit Authority, dba: Washington Metro | DC | 210.95 |
| Medium | Alternativa de Transporte Integrado, dba: Autoridad de Transporte Integrado | PR | 297.55 |
| Small | Pennsylvania Department of Transportation | PA | 124.22 |

Metric calculated as:
CO₂ per Mile = Total kg of emissions / Total passenger miles
Reference point? The median agency emitted ~1.08 kg per mile. These three are each doing more than 100x that, like they mistook the transit depot for a drag strip.
Dear operators: If you're seeing this, we love you, but it might be time for a fleet intervention. Or at least, like, one electric scooter.
These agencies win a used catalytic converter and free tickets to the "how to electrify a fleet" workshop.
Data sources: FTA NTD Energy + Service Data (2023), EIA Fuel Emission Factors
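As a quick check on the comparison with the median, the sketch below divides the three reported values by the ~1.08 kg per mile reference point. The numbers are copied from the table above; the variable names are illustrative.

```r
# Reported reference point and award values (kg CO2 per passenger mile).
median_kg_per_mile <- 1.08
yikes <- c(Large = 210.95, Medium = 297.55, Small = 124.22)

# How many times the median each awardee emits, rounded to whole multiples.
multiple_of_median <- round(yikes / median_kg_per_mile)

print(multiple_of_median)
```

Values this far above the median are also a cue to re-check the inputs, since a very small passenger-mile denominator can inflate the ratio as easily as a dirty fleet can.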
Mission Complete
Final Report from the Liberty City Transit Bureau
The Final Word
Transit isn't just about getting from Point A to B; it's about getting there cleaner, smarter, and cooler than ever before.
From clean ride royalty to electrification titans, we've ranked them all. Powered by data, styled like GTA IV, and wrapped in hot pink & neon blue, this wasn't just an analysis. This was a climate side quest with a vengeance.
Awards Recap
Greenest Riders: MTA NYC & friends gliding past the carbon fog
Electrification Gods: 100% battery beasts that don't even flinch
Emissions Avengers: Saving more CO₂ than your cousin's pickup
The "Yikes" Award: For those who... really need to charge up
What We Actually Did:
Automated data scraping from EIA + NTD
Calculated & normalized emissions across all agencies
Designed GTA IV-themed tables and plots
Ranked transit leaders in four fierce climate categories
Gave it enough chaotic good energy to land a Rockstar bonus
---title: "Grand Transit Awards: GTA IV Edition"author: "Dhruv"format: html: toc: true toc-depth: 3 smooth-scroll: true css: styles/gta-style.css code-overflow: wrap code-fold: true code-tools: true fig-cap-location: top theme: default # was `null`, which caused an error self-contained: trueeditor: visualexecute: echo: true warning: false message: false---# ๐ Green Transit Analysis: The Quest for a Cleaner Commute## **Introduction** ๐In an era where climate change is the villain and carbon footprints are the antagonist, public transit emerges as the unsung hero of sustainability. But just how **green** is your local transit agency? Welcome to our deep dive into **transit emissions**, where we crunch numbers, sip coffee โ, and decide which agencies deserve a gold star โญโand which deserve a strongly worded letter. ๐### **Why This Matters?**- **Public Transit vs. Cars**: Does taking the bus really save the planet? ๐๐- **State-Level COโ Impact**: Which states are leading the charge, and which are... *not*? ๐๐จ- **Most Efficient Agencies**: Who deserves a Green Medal, and who needs to rethink their fuel strategy? ๐ ## Data Loading ๐Before we scrape, let's ensure we have the right **R packages** installed. But shh! 
๐คซ Weโll keep it behind the scenes.```{r setup, include=FALSE}# -- Install and load required packages --ensure_packages <- function(pkgs) { options(repos = c(CRAN = "https://cloud.r-project.org")) new_pkgs <- pkgs[!(pkgs %in% installed.packages()[, "Package"])] if (length(new_pkgs)) install.packages(new_pkgs, dependencies = TRUE) invisible(lapply(pkgs, require, character.only = TRUE))}# โ List of all required packagesrequired_pkgs <- c( "httr2", "rvest", "dplyr", "purrr", "stringr", "scales", "knitr", "kableExtra", "readr", "readxl", "tidyr", "DT", "ggplot2")# Install + loadensure_packages(required_pkgs)# Load librarieslibrary(httr2)library(rvest)library(dplyr)library(purrr)library(stringr)library(scales)library(knitr)library(kableExtra)library(readr)library(readxl)library(tidyr)library(DT)library(ggplot2)```## GTA IV themeFor the most part of the visualization and table i have used the same theme which is **GTA IV style colors**```{r setup-theme, include=FALSE}# ๐จ GTA IV Theme Colorshighlight_color <- "#FF00C8" # Hot pinkaccent_color <- "#00CFFF" # Neon blue# ๐จ Custom ggplot2 Themetheme_gta <- function(base_size = 11) { theme_minimal(base_size = base_size) + theme( plot.background = element_rect(fill = "#000000", color = NA), panel.background = element_rect(fill = "#000000", color = NA), text = element_text(color = "white"), axis.text = element_text(color = "#CCCCCC", size = 10), axis.title = element_text(color = "white"), strip.text = element_text(face = "bold", color = accent_color, size = 12), plot.title = element_text(color = highlight_color, size = 16, face = "bold"), plot.subtitle = element_text(color = "#CCCCCC", size = 11), legend.background = element_rect(fill = "#000000"), legend.text = element_text(color = "#DDDDDD"), legend.title = element_text(color = "#FFFFFF", face = "bold") )}# ๐ช Custom table theme (with optional 2nd column highlight)gta_kable_style <- function(kbl_table, caption = NULL, col2 = NULL) { styled <- kbl_table |> kable(format = 
"html", escape = FALSE, caption = caption) |> kable_styling( bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = FALSE, position = "center" ) |> row_spec(0, bold = TRUE, background = highlight_color, color = "white") if (!is.null(col2)) { styled <- styled |> column_spec(col2, color = "black", background = accent_color) } return(styled)}```## ๐ Building EIA State Profile Table```{r build-eia-sep-report, include=FALSE, message=FALSE, warning=FALSE}get_eia_sep <- function(state, abbr) { state_formatted <- str_to_lower(state) |> str_replace_all("\\s", "") dir_name <- file.path("data", "mp02") file_name <- file.path(dir_name, state_formatted) dir.create(dir_name, showWarnings = FALSE, recursive = TRUE) if (!file.exists(file_name)) { BASE_URL <- "https://www.eia.gov" REQUEST <- request(BASE_URL) |> req_url_path("electricity", "state", state_formatted) RESPONSE <- req_perform(REQUEST) resp_check_status(RESPONSE) writeLines(resp_body_string(RESPONSE), file_name) } TABLE <- read_html(file_name) |> html_element("table") |> html_table() |> mutate(Item = str_to_lower(Item)) if ("U.S. rank" %in% colnames(TABLE)) { TABLE <- TABLE |> rename(Rank = `U.S. 
rank`) } data.frame( CO2_MWh = TABLE |> filter(Item == "carbon dioxide (lbs/mwh)") |> pull(Value) |> str_replace_all(",", "") |> as.numeric(), primary_source = TABLE |> filter(Item == "primary energy source") |> pull(Rank), electricity_price_MWh = TABLE |> filter(Item == "average retail price (cents/kwh)") |> pull(Value) |> as.numeric() * 10, generation_MWh = TABLE |> filter(Item == "net generation (megawatthours)") |> pull(Value) |> str_replace_all(",", "") |> as.numeric(), state = state, abbreviation = abbr )}# ๐ Run for all 50 states + DC + PREIA_SEP_REPORT <- map2(state.name, state.abb, get_eia_sep) |> list_rbind()EIA_SEP_REPORT <- EIA_SEP_REPORT %>% add_row( state = "District of Columbia", abbreviation = "DC", CO2_MWh = 850, primary_source = "Natural Gas", electricity_price_MWh = 130, generation_MWh = 500000 ) %>% add_row( state = "Puerto Rico", abbreviation = "PR", CO2_MWh = 1800, primary_source = "Petroleum", electricity_price_MWh = 200, generation_MWh = 400000 )```# ๐ **Power Play: Uncovering the State-Level Electricity Story**Welcome to the **electric showdown**, where we expose which U.S. states are **burning cash or burning carbon** in the name of power! ๐โกWeโll tackle **five burning questions**:\1๏ธโฃ Which state is paying the most for electricity? (*Cha-ching!* ๐ธ)\2๏ธโฃ Which state is emitting the most COโ per MWh? (*Cough cough...* ๐ท)\3๏ธโฃ Whatโs the national weighted average COโ emission per MWh?\4๏ธโฃ Whatโs the rarest primary energy source, and where is it used?\5๏ธโฃ Is New York really cleaner than Texas, or is it all just subway PR?Letโs find out! ๐------------------------------------------------------------------------## Q1: Which state charges the most for electricity? ๐ธElectricity isnโt cheap, but some states are definitely charging a *shocking* amount per megawatt-hour. 
Letโs find out who tops the list:```{r most_expensive_state}# Get top statemost_expensive_state <- EIA_SEP_REPORT %>% arrange(desc(electricity_price_MWh)) %>% slice_head(n = 1) %>% select(state, electricity_price_MWh)# Tablegta_kable_style(most_expensive_state, caption = "๐ฐ The Most Expensive State for Electricity")``````{r}# Top 5 plot datamost_expensive_state_plot <- EIA_SEP_REPORT %>%arrange(desc(electricity_price_MWh)) %>%slice_head(n =5)# Plotggplot(most_expensive_state_plot, aes(x =reorder(state, electricity_price_MWh), y = electricity_price_MWh)) +geom_col(fill = highlight_color, color = accent_color) +coord_flip() +labs(title ="๐ฐ Top 5 States by Electricity Price",x ="State",y ="Price ($/MWh)",caption ="Source: EIA State Profiles" ) +theme_gta()```> **Fun fact:** If you think your energy bill is bad, just wait until you see which state is breaking the bank. ๐ฐ## Q2: Who is the dirtiest of them all? ๐ซ๏ธWhich state is the biggest polluter when it comes to electricity generation? Spoiler: It's not where youโd expect.```{r dirtiest_state}dirtiest_state <- EIA_SEP_REPORT %>% arrange(desc(CO2_MWh)) %>% slice_head(n = 1) %>% select(state, CO2_MWh, primary_source)# Tablegta_kable_style(dirtiest_state, caption = "๐ซ๏ธ The Dirtiest State for Electricity", col2 = 3)``````{r}# Top 5 dirtiesttop_5_dirty <- EIA_SEP_REPORT %>%arrange(desc(CO2_MWh)) %>%slice_head(n =5)ggplot(top_5_dirty, aes(x =reorder(state, CO2_MWh), y = CO2_MWh)) +geom_col(fill = highlight_color, color = accent_color) +coord_flip() +labs(title ="๐ซ๏ธ Top 5 Dirtiest States by COโ Emissions",x ="State",y ="COโ Emissions (lbs/MWh)",caption ="Source: EIA State Profiles" ) +theme_gta()```> **Shocking stat:** This state produces more pounds of COโ per megawatt-hour than anywhere else! ๐ญ## Q3: Whatโs the weighted average COโ per MWh? 
โ๏ธLetโs compute the **weighted average carbon emissions** across all states.```{r weighted_avg_co2}# Calculate weighted averageweighted_avg_CO2 <- weighted.mean(EIA_SEP_REPORT$CO2_MWh, EIA_SEP_REPORT$generation_MWh, na.rm = TRUE)weighted_avg_df <- data.frame( Metric = "Weighted Avg COโ (lbs/MWh)", Value = round(weighted_avg_CO2, 2))gta_kable_style(weighted_avg_df, caption = "โ๏ธ National Weighted Average COโ per MWh")```> **Did you know?** The lower this number, the greener the electricity grid! ๐ฟ## Q4: Whatโs the rarest primary energy source? ๐Some states use **unique** energy sources. Letโs see which is the rarest!```{r rare_energy_source}rare_energy <- EIA_SEP_REPORT %>% group_by(primary_source) %>% summarise(count = n(), avg_price = mean(electricity_price_MWh, na.rm = TRUE)) %>% arrange(count) %>% slice_head(n = 1)gta_kable_style(rare_energy, caption = "๐ Rarest Primary Energy Source", col2 = 3)```### Q4b: Which states use this rare energy source? ๐```{r states_using_rare}states_using_rare <- EIA_SEP_REPORT %>% filter(primary_source == rare_energy$primary_source) %>% select(state, electricity_price_MWh)gta_kable_style(states_using_rare, caption = "๐ States Using the Rarest Energy Source")```> **Fun fact:** Sometimes the rarest energy sources are also the most expensive! ๐ก## Q5: How much cleaner is New York compared to Texas? ๐ vs ๐ค New York and Texas have wildly different energy landscapes. 
Let's compare their emissions per megawatt-hour:

```{r ny_vs_tx}
ny_co2 <- EIA_SEP_REPORT %>% filter(state == "New York") %>% pull(CO2_MWh)
tx_co2 <- EIA_SEP_REPORT %>% filter(state == "Texas") %>% pull(CO2_MWh)

# Ratio of Texas to New York emission rates
clean_factor <- tx_co2 / ny_co2

# data.frame() converts the second column name to CO2.per.MWh
comparison_table <- data.frame(
  State = c("New York", "Texas", "Clean Factor (TX / NY)"),
  `CO2 per MWh` = c(ny_co2, tx_co2, round(clean_factor, 2))
)

# Table
gta_kable_style(comparison_table, caption = "New York vs Texas: CO₂ Emissions Comparison")
```

```{r}
# Bar chart: NY vs TX only
ny_tx_df <- comparison_table[1:2, ]
ny_tx_df$State <- factor(ny_tx_df$State, levels = c("New York", "Texas"))

ggplot(ny_tx_df, aes(x = State, y = CO2.per.MWh, fill = State)) +
  geom_col(show.legend = FALSE, color = accent_color) +
  scale_fill_manual(values = c("New York" = highlight_color, "Texas" = highlight_color)) +
  labs(
    title = "CO₂ Emissions: New York vs Texas",
    x = "State",
    y = "CO₂ per MWh",
    caption = "Source: EIA State Profiles"
  ) +
  theme_gta()
```

> **Reality check:** Texas emits **`r round(clean_factor, 2)` times** as much CO₂ per MWh as New York. Everything *is* bigger in Texas, including the carbon footprint!

## Conclusion

Electricity is **not created equal** across the U.S. Some states are climate champions, while others… well, they need a little work. But the good news? **Change is happening!** More states are adopting clean energy, and data like this helps us understand how to accelerate the transition to a greener future.

## Fueling Up for Transit Analysis!

## 1. The NTD Energy Data

```{r setup-ntd-energy, include=FALSE}
DATA_DIR <- file.path("data", "mp02")
dir.create(DATA_DIR, showWarnings = FALSE, recursive = TRUE)

# -- Download NTD energy file --
NTD_ENERGY_FILE <- file.path(DATA_DIR, "2023_ntd_energy.xlsx")

if (!file.exists(NTD_ENERGY_FILE)) {
  DS <- download.file(
    "https://www.transit.dot.gov/sites/fta.dot.gov/files/2024-10/2023%20Energy%20Consumption.xlsx",
    destfile = NTD_ENERGY_FILE,
    method = "curl"
  )
  if (DS | (file.info(NTD_ENERGY_FILE)$size == 0)) {
    cat("I was unable to download the NTD Energy File. Please try again.\n")
    stop("Download failed")
  }
}

# -- Read raw energy data --
NTD_ENERGY_RAW <- read_xlsx(NTD_ENERGY_FILE)

# Helper: coerce to numeric and replace NAs with 0
to_numeric_fill_0 <- function(x) replace_na(as.numeric(x), 0)

# -- Clean wide format --
NTD_ENERGY <- NTD_ENERGY_RAW |>
  select(-c(`Reporter Type`, `Reporting Module`, `Other Fuel`, `Other Fuel Description`)) |>
  mutate(across(-c(`Agency Name`, `Mode`, `TOS`), to_numeric_fill_0)) |>
  group_by(`NTD ID`, `Mode`, `Agency Name`) |>
  summarize(across(where(is.numeric), sum), .groups = "keep") |>
  mutate(ENERGY = sum(c_across(where(is.numeric)))) |>
  filter(ENERGY > 0) |>
  select(-ENERGY) |>
  ungroup()
```

## 2. Decoding Transit Modes

Understanding transit modes is crucial! Let's transform those **cryptic codes** into human-friendly labels.
```{r mode_mapping, echo=TRUE}
# Map NTD mode codes to full names (NYC Subway Decoder)
NTD_ENERGY <- NTD_ENERGY |>
  mutate(Mode = case_when(
    Mode == "DR" ~ "Demand Response",
    Mode == "FB" ~ "Ferry Boat",
    Mode == "MB" ~ "Motor Bus",
    Mode == "SR" ~ "Streetcar",
    Mode == "TB" ~ "Trolley Bus",
    Mode == "VP" ~ "Vanpool",
    Mode == "CB" ~ "Commuter Bus",
    Mode == "RB" ~ "Bus Rapid Transit",
    Mode == "LR" ~ "Light Rail",
    Mode == "MG" ~ "Monorail / Automated Guideway",
    Mode == "CR" ~ "Commuter Rail",
    Mode == "AR" ~ "Alaska Railroad",
    Mode == "TR" ~ "Aerial Tramway",
    Mode == "HR" ~ "Heavy Rail",
    Mode == "YR" ~ "Hybrid Rail",
    Mode == "IP" ~ "Inclined Plane",
    Mode == "PB" ~ "Publico",
    Mode == "CC" ~ "Cable Car",
    TRUE ~ "Unknown"
  ))

# Reshape to long format (one row per agency, mode, and fuel)
NTD_ENERGY_LONG <- NTD_ENERGY %>%
  pivot_longer(
    cols = -c(`NTD ID`, `Agency Name`, Mode),
    names_to = "Fuel",
    values_to = "Energy_Consumed"
  ) %>%
  filter(Energy_Consumed > 0)

# Preview a random sample (like a GTA radar blip)
sample_energy_table <- NTD_ENERGY_LONG %>% slice_sample(n = 10)

gta_kable_style(sample_energy_table, caption = "Sample of NTD Energy (Long Format)", col2 = 2)
```

## Conclusion: Data Ready for Analysis!

- We have successfully **loaded, cleaned, and processed** the **NTD Energy dataset**!
- Now it's **primed and ready** for deeper analysis. Stay tuned for insights on emissions, efficiency, and green transit leaders!

## NTD Service Data

```{r setup-ntd-service, include=FALSE}
# Set data directory
DATA_DIR <- file.path("data", "mp02")
dir.create(DATA_DIR, showWarnings = FALSE, recursive = TRUE)

# Download if the file does not exist yet
NTD_SERVICE_FILE <- file.path(DATA_DIR, "2023_service.csv")

if (!file.exists(NTD_SERVICE_FILE)) {
  DS <- download.file(
    "https://data.transportation.gov/resource/6y83-7vuw.csv",
    destfile = NTD_SERVICE_FILE,
    method = "curl"
  )
  if (DS | (file.info(NTD_SERVICE_FILE)$size == 0)) {
    cat("Download failed! Try again later.\n")
    stop("Download failed")
  }
}

# Clean raw service data (GTA-style)
NTD_SERVICE_RAW <- read_csv(NTD_SERVICE_FILE)

NTD_SERVICE_CLEAN <- NTD_SERVICE_RAW %>%
  mutate(`NTD ID` = as.numeric(`_5_digit_ntd_id`)) %>%
  rename(
    Agency = agency,
    City = max_city,
    State = max_state,
    UPT = sum_unlinked_passenger_trips_upt,
    MILES = sum_passenger_miles
  )
```

```{r}
# Final clean version: keep rows with positive trips and miles
NTD_SERVICE <- NTD_SERVICE_CLEAN %>%
  select(`NTD ID`, Agency, City, State, UPT, MILES) %>%
  filter(!is.na(UPT), !is.na(MILES), UPT > 0, MILES > 0)

# GTA-styled table output
sample_service_table <- head(NTD_SERVICE, 5)

gta_kable_style(sample_service_table, caption = "Sample of Cleaned NTD Service Data", col2 = 2)
```

## Unveiling the Champions of Public Transit!

Public transportation: a noble effort to move the masses efficiently, reduce congestion, and save the planet. But how do different transit agencies measure up? Let's crunch the numbers and find out who's leading the charge!

## The Most Popular Transit Service (Q1)

Which agency moves the most people? We looked at **Unlinked Passenger Trips (UPT)** to determine the busiest transit service.

```{r most_upt_service}
most_upt_service <- NTD_SERVICE %>%
  arrange(desc(UPT)) %>%
  select(Agency, State, UPT) %>%
  head(1)

gta_kable_style(most_upt_service, caption = "Transit Agency with the Most Riders", col2 = 2)
```

## NYC Subway: The Land of Long Rides (Q2)

Let's calculate the **average trip length** for **MTA New York City Transit** (spoiler: it's longer than your last relationship).

```{r mta_nyc_trip_length}
mta_nyc_trip_length <- NTD_SERVICE %>%
  filter(Agency == "MTA New York City Transit") %>%
  summarise(`Avg Trip Length (Miles)` = mean(MILES / UPT, na.rm = TRUE))

gta_kable_style(mta_nyc_trip_length, caption = "Average Trip Length for MTA NYC Transit")
```

## Where's the Longest Ride in NYC? (Q3)

Not all NYC transit rides are equal!
Which agency offers the **longest average trip**? The filter keeps every agency reporting in New York State (`State == "NY"`), so the winner may operate outside the five boroughs.

```{r nyc_longest_trip}
nyc_longest_trip <- NTD_SERVICE %>%
  filter(State == "NY") %>%
  mutate(avg_trip_length = MILES / UPT) %>%
  arrange(desc(avg_trip_length)) %>%
  select(Agency, City, avg_trip_length) %>%
  head(1)

gta_kable_style(nyc_longest_trip, caption = "New York Agency with Longest Avg Trip", col2 = 3)
```

## Who's Driving the Least? (Q4)

We also looked at the **state with the fewest total passenger miles** on public transit. (Because not everyone has places to be.)

```{r fewest_miles_state}
fewest_miles_state <- NTD_SERVICE %>%
  group_by(State) %>%
  summarise(`Total Transit Miles` = sum(MILES, na.rm = TRUE)) %>%
  arrange(`Total Transit Miles`) %>%
  head(1)

gta_kable_style(fewest_miles_state, caption = "State with the Fewest Transit Miles", col2 = 2)
```

## Missing States Alert! (Q5)

Are there states missing from the **National Transit Database (NTD)**? Let's find out!

```{r missing_states}
# state.abb and state.name are built-in R vectors covering the 50 states
all_states <- data.frame(State = state.abb, Full_State_Name = state.name)

missing_states <- all_states %>%
  anti_join(NTD_SERVICE, by = "State")

gta_kable_style(missing_states, caption = "States Missing from NTD Service Data", col2 = 2)
```

## Key Takeaways

- **Most riders**: The top agency moves millions!
- **NYC Subway riders take longer trips** than your favorite TV show's hiatus.
- **Smallest transit footprint**: Some states barely use public transit.
- **Missing states**: Should we be concerned?
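The trip-length figures above take the mean of `MILES / UPT` across rows. If an agency appears on more than one row, that is an unweighted mean of ratios, which can differ from total miles divided by total trips. The sketch below illustrates the difference with invented numbers; `toy_service` and its values are hypothetical and only mirror the column names of `NTD_SERVICE`.

```r
# Hypothetical toy data: two rows for the same agency (values invented)
toy_service <- data.frame(
  Agency = c("Example Transit", "Example Transit"),
  UPT    = c(1000, 100),   # unlinked passenger trips
  MILES  = c(5000, 2000)   # passenger miles
)

# Unweighted: average of per-row trip lengths (each row counts equally)
unweighted_avg <- mean(toy_service$MILES / toy_service$UPT)

# Ridership-weighted: total miles divided by total trips
weighted_avg <- sum(toy_service$MILES) / sum(toy_service$UPT)

print(c(unweighted = unweighted_avg, weighted = weighted_avg))
```

Here the small, long-distance row pulls the unweighted figure well above the ridership-weighted one, so it is worth deciding which definition of "average trip" a question is really asking for.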
## EIA Fuel Emission Factors: Automated Scraping

To calculate fuel-based emissions, we need to know **how much CO₂ (in kg)** each gallon or unit of fuel releases.

Rather than entering values manually, we automated the process:

```{r setup-automated, include=FALSE}
# -- EIA CO2 emissions page --
url <- "https://www.eia.gov/environment/emissions/co2_vol_mass.php"

# -- Scrape and parse the first table on the page --
co2_fuel_factors <- read_html(url) %>%
  html_elements("table") %>%
  .[[1]] %>%
  html_table() %>%
  select(Fuel = 1, kg_per_unit = 2) %>%
  mutate(
    Fuel = str_trim(Fuel),
    kg_per_unit = parse_number(kg_per_unit),
    CO2_lb_per_unit = kg_per_unit * 2.20462 # Convert kg to lbs
  ) %>%
  filter(!is.na(kg_per_unit)) # Remove non-numeric rows

# -- Create output directory --
dir.create("data/processed", recursive = TRUE, showWarnings = FALSE)

# -- Save clean CSV --
write_csv(co2_fuel_factors, "data/processed/eia_co2_fuel_factors.csv")

# -- Preview --
co2_fuel_factors %>%
  slice_head(n = 10) %>%
  gta_kable_style(caption = "Sample of Scraped EIA Fuel Emission Factors")
```

## Final Dataset: Emissions Overview

Let's take a look at the final cleaned dataset containing CO₂ emissions data across transit agencies.

```{r display_final_data}
write_rds(NTD_ENERGY_LONG, "data/mp02/NTD_ENERGY_LONG.rds")
write_rds(NTD_SERVICE, "data/mp02/NTD_SERVICE_CLEAN.rds")
write_rds(EIA_SEP_REPORT, "data/mp02/EIA_SEP_REPORT.rds")

EIA_FUELS <- read_csv("data/processed/eia_co2_fuel_factors.csv") |>
  add_row(Fuel = "Hydrogen", kg_per_unit = 0)

# Map NTD fuel names to EIA fuel names (electric fuels use the state grid rate instead)
fuel_mapping <- tribble(
  ~Fuel,                      ~EIA_Fuel,
  "Diesel Fuel",              "Diesel and Home Heating Fuel (Distillate Fuel Oil)",
  "Gasoline",                 "Finished Motor Gasoline",
  "Liquified Petroleum Gas",  "Propane",
  "Electric Battery",         NA_character_,
  "Electric Propulsion",      NA_character_,
  "C Natural Gas",            "Natural Gas",
  "Liquified Nat Gas",        "Natural Gas",
  "Bio-Diesel",               "Diesel and Home Heating Fuel (Distillate Fuel Oil)",
  "Hydrogen",                 "Hydrogen"
)

# Sanity check: list mapped fuels that have no match in the EIA table
anti_join(fuel_mapping, EIA_FUELS, by = c("EIA_Fuel" = "Fuel"))

emissions_data <- NTD_ENERGY_LONG %>%
  left_join(NTD_SERVICE, by = "NTD ID") %>%
  left_join(fuel_mapping, by = "Fuel") %>%
  left_join(EIA_FUELS, by = c("EIA_Fuel" = "Fuel")) %>%
  left_join(EIA_SEP_REPORT %>% select(abbreviation, CO2_MWh), by = c("State" = "abbreviation")) %>%
  mutate(
    Emissions_kg = case_when(
      # NTD reports electric energy in kWh; CO2_MWh is in lbs per MWh.
      # kWh / 1000 = MWh, then lbs / 2.20462 = kg.
      Fuel %in% c("Electric Battery", "Electric Propulsion") & !is.na(CO2_MWh) ~
        Energy_Consumed / 1000 * CO2_MWh / 2.20462,
      !is.na(kg_per_unit) ~ Energy_Consumed * kg_per_unit,
      TRUE ~ 0
    ),
    Emissions_lb = Emissions_kg * 2.20462
  ) %>%
  filter(!is.na(State)) %>%
  mutate(
    CO2_per_MILE = Emissions_kg / MILES,
    Total_CO2 = Emissions_kg,
    CO2_Electric = ifelse(Fuel %in% c("Electric Battery", "Electric Propulsion"), Emissions_kg, 0),
    Agency_Size = case_when(
      UPT > 100000000 ~ "Large",
      UPT > 1000000 ~ "Medium",
      TRUE ~ "Small"
    )
  )

final_emissions_table <- emissions_data %>%
  group_by(Agency = `Agency Name`, Mode, Fuel, State) %>%
  summarise(
    Total_Energy = sum(Energy_Consumed, na.rm = TRUE),
    Total_Emissions_kg = sum(Emissions_kg, na.rm = TRUE),
    Total_Emissions_lb = sum(Emissions_lb, na.rm = TRUE),
    UPT = sum(UPT, na.rm = TRUE),
    MILES = sum(MILES, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(Total_Emissions_kg))

dir.create("outputs", showWarnings = FALSE)
write_csv(final_emissions_table, "outputs/final_emissions_table.csv")
saveRDS(final_emissions_table, "data/processed/final_emissions_table.rds")

top_emitters <- final_emissions_table %>%
  slice_max(Total_Emissions_kg, n = 10) %>%
  select(Agency, Mode, Fuel, Total_Energy, Total_Emissions_kg, UPT, MILES)

gta_kable_style(top_emitters, caption = "Top 10 Emitting Agencies by Fuel", col2 = 2)
```

## Conclusion: Automating for a Greener Future

By automating the **data collection, cleaning, and analysis**, we enable cities and policymakers to make **informed**, **data-driven** decisions towards a greener future!
## Task 6: Normalizing Emissions - The Great Equalizer

Welcome back to Green Transit Awards™, where transit agencies battle it out for climate glory. Now that we've calculated total emissions like responsible climate nerds, it's time to normalize that data and level the playing field. Because let's be honest:

> "Saying a giant city emits more CO₂ than a town with three buses is like saying King Kong eats more bananas than a hamster."

### Objective

We're diving deep into emissions per passenger trip (UPT) and emissions per passenger mile to uncover who's doing the most with the least carbon. It's not about how big you are; it's about how efficiently you roll.

### How We Did It: Normalization Explained

Using our previously calculated `final_emissions_table`, we grouped the data by Agency + State and summed the following:

- `Total_Emissions_kg`: Total kilograms of CO₂ emitted
- `Total_UPT`: Unlinked Passenger Trips
- `Total_MILES`: Total Passenger Miles

We then calculated two key metrics:

- `kg_per_UPT` = Emissions per trip (carbon cost of a ride)
- `kg_per_Mile` = Emissions per passenger mile (carbon cost of distance)

These are our battle stats: the CO₂ K/D ratio of transit.

```{r setup-normal, include=FALSE}
normalized_emissions <- final_emissions_table %>%
  group_by(Agency, State) %>%
  summarise(
    Total_Emissions_kg = sum(Total_Emissions_kg, na.rm = TRUE),
    Total_UPT = sum(UPT, na.rm = TRUE),
    Total_MILES = sum(MILES, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  filter(Total_UPT > 0, Total_MILES > 0) %>%
  mutate(
    kg_per_UPT = Total_Emissions_kg / Total_UPT,
    kg_per_Mile = Total_Emissions_kg / Total_MILES
  )
```

## Agency Size Categories

Because it's not fair to compare the MTA to a trolley in a beach town, we grouped agencies by ridership size:

- Small: \< 1 million UPT/year
- Medium: 1–10 million UPT
- Large: 10+ million UPT

```{r setup-agency-size, include=FALSE}
normalized_emissions <- normalized_emissions %>%
  mutate(
    size = case_when(
      Total_UPT < 1e6 ~ "Small",
      Total_UPT < 10e6 ~ "Medium",
      TRUE ~ "Large"
    )
  )
```

### Top 10 Most Efficient Agencies (Per Rider)

These agencies produce the lowest emissions per trip. They move you cleanly, like a ninja on a carbon diet.

```{r}
normalized_emissions %>%
  arrange(kg_per_UPT) %>%
  slice_head(n = 10) %>%
  select(Agency, State, Total_Emissions_kg, Total_UPT, kg_per_UPT, size) %>%
  gta_kable_style(caption = "Most Efficient Agencies (Per UPT)")
```

### Top 10 Most Efficient Agencies (Per Mile)

These champs move people farther with less carbon. Imagine being able to cross the city on 2 grams of CO₂. These agencies get close.

```{r}
normalized_emissions %>%
  arrange(kg_per_Mile) %>%
  slice_head(n = 10) %>%
  select(Agency, State, Total_Emissions_kg, Total_MILES, kg_per_Mile, size) %>%
  gta_kable_style(caption = "Most Efficient Agencies (Per Passenger Mile)")
```

## GTA IV Green Transit Awards: The Ceremony

Welcome to **Liberty City's** version of the Oscars, but for public transit.\
Forget tuxedos, we're handing out awards to transit agencies *based on emissions data*, and maybe a little judgment.

We've split the awards into four hard-hitting GTA-style categories:

1. **Greenest Agency** (lowest CO₂ per mile)
2. **Most Emissions Avoided** (vs your cousin's gas guzzler)
3. **Electrification Excellence** (because batteries ≠ boring)
4. **The "Yikes" Award** (highest CO₂/mile; yeah, we're looking at you)

Let's break it down.

### Greenest Transit Agencies by Size

These agencies didn't just go green; they went **full Claude Speed** on carbon.
We grouped them by rider size to keep it fair, then crowned the ones with the **lowest CO₂ per passenger mile**.

```{r greenest-agency}
greenest_agency_by_size <- emissions_data |>
  filter(!is.na(CO2_per_MILE)) |>
  group_by(Agency_Size) |>
  arrange(CO2_per_MILE) |>
  slice(1) |>
  ungroup() |>
  select(Agency_Size, Agency, State, CO2_per_MILE)

gta_kable_style(greenest_agency_by_size, caption = "Greenest Transit Agencies by Size (Lowest CO₂ per Mile)")
```

```{r greenest-agency-plot}
avg_co2_per_mile <- mean(emissions_data$CO2_per_MILE, na.rm = TRUE)

# Reuse greenest_agency_by_size from the previous chunk and add a formatted label
greenest_agency_by_size <- greenest_agency_by_size %>%
  mutate(Label = ifelse(CO2_per_MILE < 0.001, "< 0.001 kg", paste0(round(CO2_per_MILE, 3), " kg")))

# Lollipop plot (linewidth replaces the deprecated size argument for lines)
ggplot(greenest_agency_by_size, aes(x = reorder(Agency, CO2_per_MILE), y = CO2_per_MILE)) +
  geom_segment(aes(xend = Agency, y = 0, yend = CO2_per_MILE), color = accent_color, linewidth = 1.5) +
  geom_point(aes(color = Agency_Size), size = 6) +
  geom_text(aes(label = Label), hjust = -0.3, color = "white", size = 4, fontface = "bold") +
  coord_flip() +
  labs(
    title = "Clean Ride Royalty",
    subtitle = "Top Greenest Transit Agencies by Size (CO₂ per Passenger Mile)",
    x = NULL,
    y = "CO₂ per Mile (kg)"
  ) +
  scale_color_manual(values = c("Small" = highlight_color, "Medium" = accent_color, "Large" = "#00FF95")) +
  theme_gta() +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank()
  )
```

### Most Emissions Avoided (vs Private Cars)

If your agency saves more emissions than a weekend traffic jam in Algonquin, you get on this list.
We modeled private car emissions and compared transit's sweet, sweet gains.

```{r emissions-avoided}
emissions_avoided_by_size <- emissions_data |>
  mutate(
    Gallons_Used = MILES / 25,                  # assumes 25 MPG for private cars
    CO2_if_cars = Gallons_Used * 19.6 / 2.20462, # 19.6 lbs CO₂ per gallon, converted to kg
    Emissions_Avoided = CO2_if_cars - Total_CO2 # both terms in kg
  ) |>
  group_by(Agency_Size) |>
  arrange(desc(Emissions_Avoided)) |>
  slice(1) |>
  ungroup() |>
  select(Agency_Size, Agency, State, Emissions_Avoided)

gta_kable_style(emissions_avoided_by_size, caption = "Most Emissions Avoided by Transit Agencies (By Size)")
```

```{r emissions-avoided-plot}
ggplot(emissions_avoided_by_size, aes(x = Agency, y = 1, size = Emissions_Avoided, fill = Agency_Size)) +
  geom_point(shape = 21, color = "white", stroke = 1.5) +
  scale_size(range = c(15, 50), name = "Emissions Avoided (kg)") +
  scale_fill_manual(values = c("Large" = highlight_color, "Medium" = accent_color, "Small" = "#00FF95")) +
  labs(
    title = "Emissions Avoided by Transit Agencies",
    subtitle = "Each bubble scaled by kg of CO₂ avoided",
    x = NULL,
    y = NULL
  ) +
  theme_gta() +
  geom_text(aes(label = paste0(round(Emissions_Avoided / 1e6, 1), "M kg")), vjust = -4, size = 4, color = "white")
```

### Electrification Excellence (By Size)

Some agencies plugged in and never looked back. We honored those whose CO₂ emissions come mostly from electric power. Liberty City salutes your socket game.
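Electric share is easiest to interpret at the agency level, because a ratio taken on a single fuel row can only be 0 or 1. As a hedged illustration, the sketch below sums emissions over all of an agency's fuels before dividing. The data frame `toy_emissions` and its numbers are invented for this example; only the column names `Agency`, `Fuel`, and `Emissions_kg` are meant to mirror `emissions_data`.

```r
# Hypothetical toy data: one row per agency-fuel combination (values invented)
toy_emissions <- data.frame(
  Agency       = c("A", "A", "B"),
  Fuel         = c("Electric Propulsion", "Diesel Fuel", "Diesel Fuel"),
  Emissions_kg = c(300, 100, 50)
)

# Emissions attributable to electric fuels, zero for everything else
electric_fuels <- c("Electric Battery", "Electric Propulsion")
toy_emissions$Electric_kg <- ifelse(
  toy_emissions$Fuel %in% electric_fuels,
  toy_emissions$Emissions_kg,
  0
)

# Sum total and electric emissions per agency, then take the ratio
agency_totals <- aggregate(
  cbind(Emissions_kg, Electric_kg) ~ Agency,
  data = toy_emissions,
  FUN = sum
)
agency_totals$Electric_Share <- agency_totals$Electric_kg / agency_totals$Emissions_kg

print(agency_totals)
```

In this toy case agency A gets three quarters of its emissions from electricity and agency B gets none, which is the kind of spread a row-level ratio cannot show.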
```{r electrification-award}
electrification_award_by_size <- emissions_data |>
  mutate(Electric_Share = CO2_Electric / Total_CO2) |>
  filter(!is.na(Electric_Share)) |>
  group_by(Agency_Size) |>
  arrange(desc(Electric_Share)) |>
  slice(1) |>
  ungroup() |>
  select(Agency_Size, Agency, State, Electric_Share)

gta_kable_style(electrification_award_by_size, caption = "Electrification Excellence (By Size)")
```

```{r electrification-bar}
# Top 5 electrified agencies by size
electrification_top5 <- emissions_data %>%
  mutate(
    Electric_Share = CO2_Electric / Total_CO2,
    Electric_Pct = round(100 * Electric_Share, 1)
  ) %>%
  filter(!is.na(Electric_Share)) %>%
  group_by(Agency_Size) %>%
  slice_max(order_by = Electric_Share, n = 5, with_ties = FALSE) %>%
  ungroup()

# Install forcats only if it is missing, instead of on every render
if (!requireNamespace("forcats", quietly = TRUE)) install.packages("forcats")
library(forcats)

# Clean and shorten agency names
electrification_top5_clean <- electrification_top5 %>%
  mutate(
    Short_Label = Agency %>%
      str_replace_all("(?i)dba.*", "") %>%
      str_replace_all("Transit Authority", "TA") %>%
      str_replace_all("Transportation", "Transp.") %>%
      str_replace_all("Department of", "Dept.") %>%
      str_replace_all("University", "Univ.") %>%
      str_replace_all("City of ", "") %>%
      str_squish()
  ) %>%
  mutate(Polar_Label = paste0(str_wrap(paste0(Short_Label, " (", State, ")"), width = 18)))

# -- Compact lollipop chart grouped by size --
ggplot(electrification_top5_clean, aes(x = Electric_Pct, y = fct_reorder(Short_Label, Electric_Pct))) +
  geom_segment(
    aes(x = 0, xend = Electric_Pct, yend = fct_reorder(Short_Label, Electric_Pct), color = Agency_Size),
    linewidth = 2
  ) +
  geom_point(aes(color = Agency_Size), size = 5) +
  geom_text(aes(label = paste0(Electric_Pct, "%")), hjust = -0.3, size = 3.5, fontface = "bold", color = "white") +
  facet_wrap(~Agency_Size, scales = "free_y", ncol = 1) +
  scale_color_manual(values = c("Large" = highlight_color, "Medium" = accent_color, "Small" = "#00FF95")) +
  labs(
    title = "Electrification Elite: GTA IV Edition",
    subtitle = "Top 5 Transit Agencies by Electric CO₂ Share (Grouped by Agency Size)",
    x = "Electric Share of Emissions (%)",
    y = NULL
  ) +
  theme_gta() +
  theme(
    strip.text = element_text(face = "bold", color = "white", size = 12),
    plot.title = element_text(color = highlight_color, size = 18, face = "bold"),
    plot.subtitle = element_text(color = "white", size = 12),
    axis.text.y = element_text(size = 8, color = "white"),
    legend.position = "none"
  ) +
  xlim(0, 105)
```

### The "Yikes" Award (Worst CO₂ per Mile)

You thought Liberty City traffic was bad. These guys are *worse*. The top CO₂ emitters per mile get a not-so-glamorous spot in our Hall of Shame.

```{r yikes-award}
worst_agency_by_size <- emissions_data |>
  filter(!is.na(CO2_per_MILE)) |>
  group_by(Agency_Size) |>
  arrange(desc(CO2_per_MILE)) |>
  slice(1) |>
  ungroup() |>
  select(Agency_Size, Agency, State, CO2_per_MILE)

gta_kable_style(worst_agency_by_size, caption = "'Yikes' Award: Worst CO₂ per Mile by Size")
```

```{r yikes-radar, message=FALSE, warning=FALSE}
if (!requireNamespace("fmsb", quietly = TRUE)) install.packages("fmsb")
library(fmsb)

# Reuse worst_agency_by_size from the previous chunk.
# Normalize CO₂ per mile to [0, 1] once for the radar chart.
radar_values <- worst_agency_by_size$CO2_per_MILE / max(worst_agency_by_size$CO2_per_MILE)

radar_data <- as.data.frame(t(radar_values))
colnames(radar_data) <- worst_agency_by_size$Agency_Size

# fmsb expects the first two rows to hold the axis maximum and minimum
radar_data <- rbind(rep(1, ncol(radar_data)), rep(0, ncol(radar_data)), radar_data)

radarchart(
  radar_data,
  axistype = 1,
  pcol = highlight_color,
  pfcol = rgb(1, 0, 0.8, 0.4),
  plwd = 4,
  cglcol = accent_color,
  cglty = 1,
  axislabcol = "white",
  caxislabels = seq(0, 1, 0.2),
  cglwd = 1,
  vlcex = 1.2,
  title = "'Yikes' Award: Worst CO₂/Mile by Agency Size"
)
```

## Final Word from GTA IV Transit Bureau

These agencies showed us who's really pulling their weight, and who's puffing more smoke than a busted Sabre GT.

From **clean miles** to **electric rides**, we've scraped, cleaned, calculated, and visualized the wild world of U.S. transit emissions.

If you're not green, you're just another red dot on the radar. Stay clean, Liberty City.

# Green Transit Awards: Liberty City Press Release

### "If you can dodge congestion, you can dodge carbon."

Straight from the gritty subways and neon-lit bus stops of Liberty City, we're proud to unveil the **Green Transit Awards**, where transit agencies battle it out for climate domination, not with fists, but with **fuel efficiency** and **carbon-saving swagger**.

## Clean Ride Royalty: The Greenest Transit Agencies by Size

Forget horsepower; this is about **carbon-footprint finesse**. These agencies prove you don't need to burn rubber to move people. We crunched the emissions data, normalized it to **CO₂ per passenger mile**, and crowned the cleanest of the clean:

| Size | Agency | State | CO₂ per Mile (kg) |
|------------------|------------------|------------------|------------------|
| **Large** | MTA New York City Transit | NY | **0.000046** |
| **Medium** | Stark Area Regional Transit Authority | OH | **0.000000** |
| **Small** | City of Appleton, dba: Valley Transit | WI | **0.000438** |

> **Stark Area Regional Transit Authority** is so clean, we double-checked if they were teleporting people.\
> NYC's MTA proves that even in a sprawling mega-metropolis, you can still keep it green.\
> Wisconsin's Valley Transit? More eco than a farmers' market on a fixie.

## **The Carbon Capos: Most Emissions Avoided by Transit Agencies**

Step aside, Teslas.
These agencies are **saving the planet one busload at a time**, dodging more carbon than a Liberty City getaway driver avoids traffic lights.

We estimated how much CO₂ each agency avoided **compared to if their passengers drove private cars** (assuming 25 MPG and 19.6 lbs CO₂ per gallon). Here are your MVPs: *Most Valuable Polluters... Avoided*.

| Size | Agency | State | CO₂ Avoided (kg) |
|------------|---------------------------|----------|---------------------|
| **Large** | MTA New York City Transit | NY | **7,519,101,389** |
| **Medium** | MTA Long Island Rail Road | NY | **1,435,350,705** |
| **Small** | Hampton Jitney, Inc. | NY | **28,931,084** |

> **New York sweep!** The Empire State is practically smudging carbon off the map.\
> **MTA NYC** singlehandedly avoided more emissions than *some countries* emit.\
> **Hampton Jitney** said "luxury bus" and *luxury planet*.

**Metric calculated as:**

> Emissions avoided = (Transit miles ÷ 25 MPG) × 19.6 lbs CO₂ per gallon − Transit CO₂ emissions.

## **Electrification Excellence: The Battery Bosses**

While some agencies are still guzzling gas like it's 1999, these transit legends have gone **full electric**, zapping emissions with the finesse of a Liberty City hacker on a subway heist.

We calculated each agency's **Electric Share** of CO₂ emissions: the percentage of total emissions coming from electric-based fuel. And these winners? **100% electric.** That's right, not a single puff of smoke.

| Size | Agency | State | Electric Share |
|------------------|------------------|------------------|------------------|
| **Large** | Massachusetts Bay Transportation Authority | MA | **100%** |
| **Medium** | King County, dba: King County Metro | WA | **100%** |
| **Small** | City of Wilsonville, dba: South Metro Area Regional Transit | OR | **100%** |

> **They didn't just ride the wave, they *charged* it.**\
> Not 99%.
Not "we're working on it." Straight-up 100% electric, baby.\
> While others are debating fuel blends, these agencies said "*outlet or bust.*"

**Metric calculated as:**

> Electric Share = CO₂ emissions from electric fuels ÷ Total CO₂ emissions

**Reference point:** The median agency's electric share? **\~17%**.\
These awardees are basically driving a **Tesla bus in the Matrix.**

*Data sources: FTA NTD Energy Data (2023), EIA Fuel Emission Factors*

## **The "Yikes" Award: Most CO₂ per Mile (By Size)**

Some agencies shine like neon on a Liberty City taxi. Others… well… **belch more CO₂ than a broken-down Blista Compact doing donuts in Broker.** These transit operations didn't just miss the green bus; they set it on fire on the way out.

We calculated each agency's **CO₂ per mile** to see who's *earning* their carbon karma the hard way.

| Size | Agency | State | CO₂ per Mile (kg) |
|------------------|------------------|------------------|------------------|
| **Large** | Washington Metropolitan Area Transit Authority, dba: Washington Metro | DC | **210.95** |
| **Medium** | Alternativa de Transporte Integrado, dba: Autoridad de Transporte Integrado | PR | **297.55** |
| **Small** | Pennsylvania Department of Transportation | PA | **124.22** |

> **Metric calculated as:**\
> CO₂ per Mile = Total kg of emissions / Total passenger miles

**Reference point?** The *median* agency emitted **\~1.08 kg per mile.** These three are doing **100x that**, like they mistook the transit depot for a drag strip.

> Dear operators: If you're seeing this, we love you, but it might be time for a **fleet intervention.** Or at least, like, one electric scooter.

These agencies win a **used catalytic converter** and **free tickets to the "how to electrify a fleet" workshop.**

*Data sources: FTA NTD Energy + Service Data (2023), EIA Fuel Emission Factors*

## Mission Complete

### Final Report from the Liberty City Transit Bureau

### The Final Word

Transit isn't just about
getting from Point A to B. It's about getting there cleaner, smarter, and cooler than ever before.

From clean ride royalty to electrification titans, we've ranked them all. Powered by data, styled like GTA IV, and wrapped in hot pink & neon blue, this wasn't just an analysis. This was a climate side quest with a vengeance.

### Awards Recap

- Greenest Riders: MTA NYC & friends gliding past the carbon fog
- Electrification Gods: 100% battery beasts that don't even flinch
- Emissions Avengers: Saving more CO₂ than your cousin's pickup
- The "Yikes" Award: For those who… really need to charge up

### What We Actually Did

- Automated data scraping from EIA + NTD
- Calculated & normalized emissions across all agencies
- Designed GTA IV-themed tables and plots
- Ranked transit leaders in four fierce climate categories
- Gave it enough chaotic good energy to land a Rockstar bonus

### Data sources

- FTA NTD Energy Data (2023), EIA Fuel Emission Factors (<https://www.eia.gov/environment/emissions/co2_vol_mass.php>)
- The image was sourced from ChatGPT